Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contact Form: Revised the email validation regex to include length limit #9323

Merged
merged 2 commits into from
Oct 9, 2019

Conversation

stevenlinx
Copy link
Contributor

@stevenlinx stevenlinx commented Apr 16, 2018

Fixes #3734

Changes proposed in this Pull Request:

  • Revised the email validation regex to include length limit, in order to match the latest practical implementation of RFC 5322 in regular-expressions.info/email.html.

  • Regex used:

PHP version: the very bottom one on regular-expressions.info/email.html. The one after the sentence "...modern regex flavors aren't truly regular, so we can add length limit checks using lookahead like we did before:" Take note, the redundant pink-colored white spaces in the middle of the regex must be removed.

\A(?=[a-z0-9@.!#$%&'*+/=?^_`{|}~-]{6,254}\z)(?=[a-z0-9.!#$%&'*+/=?^_`{|}~-]{1,64}@)[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:(?=[a-z0-9-]{1,63}\.)[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?=[a-z0-9-]{1,63}\z)[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\z

JavaScript version:

^(?=[a-z0-9@.!#$%&'*+/=?^_\`{|}~-]{6,254}$)(?=[a-z0-9.!#$%&'*+/=?^_\`{|}~-]{1,64}@)[a-z0-9!#$%&'*+/=?^_\`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_\`{|}~-]+)*@(?:(?=[a-z0-9-]{1,63}\.)[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?=[a-z0-9-]{1,63}$)[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$

Testing instructions:

1.) Checkout the code in this PR on a test WP site.
2.) Use the Contact Form and enter some test data in each field.
3.) In the email field, test different email addresses that are expected to pass and fail.
4.) Use an email you've access to, submit the form and make sure you get the email.
5.) Revert the code back to its previous state.

Proposed changelog entry for your changes:

  • Contact Form: Revised the email validation regex to include length limit.

@oskosk oskosk added this to the 6.1 milestone Apr 17, 2018
@jeherve jeherve added [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it [Feature] Contact Form [Status] Needs Review To request a review from Crew. Label will be renamed soon. labels Apr 17, 2018
@stevenlinx stevenlinx changed the title Contact Form: Revise the email validation regex to add length limit Contact Form: Revised the email validation regex to add length limit Apr 17, 2018
@stevenlinx stevenlinx changed the title Contact Form: Revised the email validation regex to add length limit Contact Form: Revised the email validation regex to include length limit Apr 17, 2018
@oskosk oskosk added [Status] Ready to Merge Go ahead, you can push that green button! and removed [Status] Needs Review To request a review from Crew. Label will be renamed soon. labels Apr 24, 2018
oskosk
oskosk previously approved these changes Apr 24, 2018
Copy link
Contributor

@oskosk oskosk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

Copy link
Member

@dereksmart dereksmart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not validating properly.

It's even checking for literal spaces right now: https://regex101.com/r/MAOFqc/4

var newRe = /(?=[a-z0-9@.!#$%&'*+/=?^_`{|}~-]{6,254}\z)  (?=[a-z0-9.!#$%&'*+/=?^_`{|}~-]{1,64}@)  [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)* @ (?:(?=[a-z0-9-]{1,63}\.)[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+   (?=[a-z0-9-]{1,63}\z)[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i;

var oldRe = /[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/i;

newRe.test( 'email@gmail.com' );
false
oldRe.test( 'email@gmail.com' );
true

@dereksmart dereksmart added [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! and removed [Status] Ready to Merge Go ahead, you can push that green button! labels Apr 24, 2018
@oskosk oskosk dismissed their stale review April 24, 2018 14:48

When I tested, I made a mistake and tested with the minified version (which doesn't include the changes proposed here)

@dereksmart dereksmart removed this from the 6.1 milestone Apr 24, 2018
@stevenlinx
Copy link
Contributor Author

stevenlinx commented Apr 26, 2018

I think I got what was causing the issues. The regex was correct but two things must be done to apply it to javascript, due to regex flavors:

1.) The \A and \z must be either be replaced (with caret ^ and dollar $) or removed (if the subject string is the email string) since javascript does not support \A and \z, but only ^ and $.

For the previous regex on the website, there were only \A and \z at the beginning and end.

Unlike the previous regex, the latest regex version on the website (the very bottom one) not only has \A and \z in the beginning and end, but also \z in the middle (due to lookahead). In my previous commit, I only removed \A and \z in the beginning and end, but not all the \z the middle.

2.) The pink-colored white spaces in the middle of regex must also be removed, which was part of web page layout. In the previous regex, there were no white space that need to be removed.

I've push a new commit with the revised regex.

=====

Tests:

user@example.com
https://regex101.com/r/sGpumM/1

local part (64 char.):
https://regex101.com/r/AZTNg9/1
expected: pass
result: pass

local part (65 char.):
https://regex101.com/r/AZTNg9/2
expected: fail
result: fail

part of domain (63 char.):
https://regex101.com/r/AZTNg9/3
expected: pass
result: pass

part of domain (64 char.):
https://regex101.com/r/AZTNg9/4
expected: fail
result: fail

total length (254 char.):
https://regex101.com/r/AZTNg9/5
expected: pass
result: pass

https://regex101.com/r/AZTNg9/6
expected: fail
result: fail

@dereksmart dereksmart added [Status] Needs Review To request a review from Crew. Label will be renamed soon. and removed [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! labels Jun 20, 2018
@brbrr brbrr added this to the 6.5 milestone Aug 21, 2018
@oskosk
Copy link
Contributor

oskosk commented Aug 23, 2018

@stevenlin-x can you please clarify in this PR's description which expression is the one you're taking as appropriate from https://www.regular-expressions.info/email.html and then, right next to it, the JS version you're implementing.

Some time has passed and there's a lot of regex translation happening

@oskosk oskosk added [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! and removed [Status] Needs Review To request a review from Crew. Label will be renamed soon. labels Aug 23, 2018
@abidhahmed abidhahmed modified the milestones: 6.5, 6.6 Aug 28, 2018
@stevenlinx
Copy link
Contributor Author

stevenlinx commented Sep 2, 2018

I've added the regex of both PHP and JS versions to the PR description.

@jeherve jeherve added [Status] Needs Review To request a review from Crew. Label will be renamed soon. and removed [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! labels Sep 5, 2018
@oskosk oskosk dismissed dereksmart’s stale review September 11, 2018 15:43

Feedback was addressed

@stale

This comment has been minimized.

@brbrr
Copy link
Contributor

brbrr commented Jan 22, 2019

I played a bit with this PR. I made a simple jsFiddle with some valid and invalid emails according to Wiki and this post

https://jsfiddle.net/f9dkotb8/1/

Here are the results:

====================
Testing valid emails
====================
email@example.com isValid: true
firstname.lastname@example.com isValid: true
email@subdomain.example.com isValid: true
firstname+lastname@example.com isValid: true
email@123.123.123.123 isValid: true
email@[123.123.123.123] isValid: **false**
"email"@example.com isValid: false
1234567890@example.com isValid: true
email@example-one.com isValid: true
_______@example.com isValid: true
email@example.name isValid: true
email@example.museum isValid: true
email@example.co.jp isValid: true
firstname-lastname@example.com isValid: true
much."more unusual"@example.com isValid: **false**
very.unusual."@".unusual.com@example.com isValid: **false**
very."(),:;<>[]".VERY."very@\\\ "very".unusual@strange.example.com isValid: **false**

====================
Testing invalid emails
====================
plainaddress isValid: false
#@%^%#$@#$@#.com isValid: false
@example.com isValid: false
Joe Smith <email@example.com> isValid: false
email.example.com isValid: false
email@example@example.com isValid: false
.email@example.com isValid: false
email.@example.com isValid: false
email..email@example.com isValid: false
あいうえお@example.com isValid: false
email@example.com (Joe Smith) isValid: false
email@example isValid: false
email@-example.com isValid: false
email@example.web isValid: **true**
email@111.222.333.44444 isValid: **true**
email@example..com isValid: false
Abc..123@example.com isValid: false
"(),:;<>[]@example.com isValid: false
just"not"right@example.com isValid: false
this is"really"not\\allowed@example.com isValid: false

@stevenlin-x do you think we should cover these failing cases too?

@brbrr brbrr added the [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! label Jan 22, 2019
@brbrr
Copy link
Contributor

brbrr commented Jan 28, 2019

FWIW: I looked into ways to cover this change with unit-tests. Unfortunately, it's not easy(or even possible) to write tests for a specific function for this file due to its structure. We might want to refactor it to make sure every function is testable.

@dereksmart dereksmart removed the [Status] Needs Review To request a review from Crew. Label will be renamed soon. label Feb 11, 2019
@stevenlinx
Copy link
Contributor Author

stevenlinx commented Mar 3, 2019

@brbrr

1.)
In the code samples you provided above, if you replace the regex with the original regex (i.e. the present regex in use in production), it will also result in false positives with those valid and invalid emails samples.

In other words, you didn't read the context of the ticket in full.

2.)
You falsely thought this was a literal RFC 5322 validation exercise when it's not.

The present regex in use was derived from
regular-expressions.info/email.html

Issue #3734 pointed out the regex no longer matches the one on the updated page.

So I simply updated to reflect the character length limits.

3.)
If you think those samples should be addressed, you should ask the author of the regular-expressions.info site or you should open a separate ticket and provide the reasons on why those samples should be addressed.

Or even better, provide the regex you would use.

@jeherve jeherve added [Status] Needs Review To request a review from Crew. Label will be renamed soon. and removed [Status] Needs Author Reply We would need you to make some changes or provide some more details about your PR. Thank you! labels Mar 4, 2019
@stale
Copy link

stale bot commented Jun 2, 2019

This PR has been marked as stale. This happened because:

  • It has been inactive in the past 3 months.
  • It hasn’t been labeled `[Pri] Blocker`, `[Pri] High`.

No further action is needed. But it's worth checking if this PR has clear testing instructions, is it up to date with master, and it is still valid. Feel free to close this issue if you think it's not valid anymore — if you do, please add a brief explanation.

@stale stale bot added the [Status] Stale label Jun 2, 2019
@matticbot
Copy link
Contributor

Caution: This PR has changes that must be merged to WordPress.com
Hello stevenlin-x! These changes need to be synced to WordPress.com - If you 're an a11n, please commandeer, review, and approve D33713-code before merging this PR. Thank you!

@stale stale bot removed the [Status] Stale label Oct 8, 2019
@kraftbj kraftbj added this to the 7.9 milestone Oct 8, 2019
@kraftbj kraftbj added [Status] Ready to Merge Go ahead, you can push that green button! and removed [Status] Needs Review To request a review from Crew. Label will be renamed soon. labels Oct 8, 2019
Copy link
Contributor

@kraftbj kraftbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the massive delay on this one. While there are other regex things to improve to ensure that all valid e-mails are accepted and all invalid are excluded, this is a step improvement and we shouldn't let perfect be the enemy of good.

Approving and open to future PRs to address the other issues (or investigate using browser-based HTML5 validation instead of the JS solution).

@matticbot
Copy link
Contributor

stevenlin-x, Your synced wpcom patch D33713-code has been updated.

@jeherve jeherve merged commit 65bd4db into Automattic:master Oct 9, 2019
@matticbot matticbot added [Status] Needs Changelog and removed [Status] Ready to Merge Go ahead, you can push that green button! labels Oct 9, 2019
jeherve added a commit that referenced this pull request Oct 23, 2019
jeherve added a commit that referenced this pull request Oct 29, 2019
* 7.9: Changelog

* Update version number

* Update stable tag and tested up to

* Changelog: add #13530

* changelog: add #13578

* Changelog: add #13598

* Changelog: add entry for numerous block preview changes

* Changelog: add #13599

* changelog: add #13541

* Changelog: add #13542

* Changelog: add #13331

* Changelog: add #13558

* Changelog: add #13409

* Changelog: add #13582

* Changelog: add #13600

* Changelog: add #13601

* Changelog: add #13595

* Changelog: add #12695

* Changelog: add #13009

* Changelog: add #13649

* Changelog: add #13450

* Changelog: add #13507

* Changelog: add #13658

* Changelog: add #13687

* changelog: add #13683

* Changelog: add #9323

* Changelog: add #13681

* Fix typos in readme

* Add link to WordPress Beta Tester plugin

* Changelog: add #13630

* Changelog: add #13695

* Changelog: add #13659

* Changelog: add #13716

* Changelog: add #13664

* Changelog: add #13682

* Changelog: add #13362

* Changelog: add #13563

* Add testing list for #13563

* Changelog: add #13735

* Changelog: add #13752

* Changelog: add #13624

* Changelog: add #13756

* Changelog: add #13745

* Changelog: add #13728

* Changelog: add #13779

* Changelog: add #13699

* Changelog: add #13804

* Changelog: add #13761

* Changelog: add #13637

* Changelog: add #13517

* Changelog: add #13521

* Changelog: add #13729

* Testing list: add testing instructions for #13729

* Changelog: add sync changes

* Changelog: add #13807

* Changelog: add #13654

* Changelog: add #13795

* Changelog: add #13801

* Changelog: add #13818

* Changelog: add #13725

* Changelog: add #13831

* Changelog: add #13516

* Testing list: add Twenty Twenty instructions

* Changelog: add #13799

* Changelog: add #13805

* Changelog: add #13688

* Changelog: add #13830
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Contact Form [Status] Tested on WP.com Touches WP.com Files [Type] Enhancement Changes to an existing feature — removing, adding, or changing parts of it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Contact Form: Email validation regex needs updating
8 participants